A framework for bottom-up induction of oblique decision trees

Authors

  • Rodrigo C. Barros
  • Pablo A. Jaskowiak
  • Ricardo Cerri
  • André Carlos Ponce de Leon Ferreira de Carvalho
Abstract

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. The vast majority of oblique and univariate decision-tree induction algorithms employ a top-down strategy for growing the tree, relying on an impurity-based measure for splitting nodes. In this paper, we propose BUTIF, a novel Bottom-Up Oblique Decision-Tree Induction Framework. BUTIF does not rely on an impurity measure for dividing nodes, since the data resulting from each split is known a priori. For generating the initial leaves of the tree and the splitting hyperplanes in its internal nodes, BUTIF allows the adoption of distinct clustering algorithms and binary classifiers, respectively. It is also capable of performing embedded feature selection, which may reduce the number of features in each hyperplane, thus improving model comprehension. Unlike virtually every top-down decision-tree induction algorithm, BUTIF does not require a subsequent pruning procedure to avoid overfitting, because its bottom-up nature does not overgrow the tree. We compare distinct instances of BUTIF to traditional univariate and oblique decision-tree induction algorithms. Empirical results show the effectiveness of the proposed framework.
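
The bottom-up idea described in the abstract can be illustrated with a small sketch. This is not the authors' BUTIF implementation: the abstract does not specify the merging rule or the classifier, so the sketch makes its own assumptions. The initial leaves are given as already-clustered groups of points (any clustering algorithm could produce them), the two nodes with the closest centroids are merged at each step, and the "binary classifier" is simply the perpendicular bisector between the two centroids. All function names (`make_leaf`, `fit_hyperplane`, `build_tree`, `predict`) are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_leaf(points, label):
    """A leaf holds one cluster of points and the class it predicts."""
    return {"label": label, "points": points, "centroid": centroid(points)}


def fit_hyperplane(left, right):
    """Assumed stand-in classifier: perpendicular bisector of the centroids.

    Returns (w, b) such that w . x + b > 0 on the side of `right`.
    """
    w = [r - l for l, r in zip(left["centroid"], right["centroid"])]
    mid = [(l + r) / 2 for l, r in zip(left["centroid"], right["centroid"])]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b


def merge(left, right):
    """Create an internal node whose oblique test separates its two children."""
    w, b = fit_hyperplane(left, right)
    points = left["points"] + right["points"]
    return {"w": w, "b": b, "left": left, "right": right,
            "points": points, "centroid": centroid(points)}


def build_tree(leaves):
    """Merge the two closest nodes repeatedly until a single root remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda p: sq_dist(nodes[p[0]]["centroid"], nodes[p[1]]["centroid"]),
        )
        merged = merge(nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def predict(node, x):
    """Follow the oblique tests from the root down to a leaf."""
    while "label" not in node:
        score = sum(wi * xi for wi, xi in zip(node["w"], x)) + node["b"]
        node = node["right"] if score > 0 else node["left"]
    return node["label"]


# Toy data: three pre-clustered leaves, two of class "a" and one of class "b".
leaves = [
    make_leaf([(0.0, 0.0), (1.0, 0.0)], "a"),
    make_leaf([(0.0, 5.0), (1.0, 6.0)], "b"),
    make_leaf([(6.0, 6.0), (7.0, 5.0)], "a"),
]
tree = build_tree(leaves)
```

Because every internal node is created by joining two existing nodes, the data on each side of a split is known before the hyperplane is fitted, and the tree never grows beyond the initial number of leaves. In this sketch the merge ignores class labels, so it may join two leaves of the same class; a fuller version would replace `fit_hyperplane` with a trained linear classifier and choose merges more carefully.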


Similar resources

Global Induction of Oblique Decision Trees: An Evolutionary Approach

A new evolutionary algorithm for induction of oblique decision trees is proposed. In contrast to the classical top-down approach, it searches for the whole tree at once. Specialized genetic operators are developed, which enable modifying both the tree structure and the splitting hyperplanes in non-terminal nodes. The problem of overfitting can be avoided thanks to suitably defined fitne...


Global Induction of Oblique Model Trees: An Evolutionary Approach

In this paper we propose a new evolutionary algorithm for global induction of oblique model trees that associates leaves with multiple linear regression models. In contrast to the typical top-down approaches, it globally searches for the best tree structure, splitting hyperplanes in internal nodes, and models in the leaves. The general structure of the proposed solution follows a typical framework of...


Global Induction of Decision Trees: From Parallel Implementation to Distributed Evolution

In most data mining systems, decision trees are induced in a top-down manner. This greedy method is fast but can fail for certain classification problems. As an alternative, a global approach based on evolutionary algorithms (EAs) can be applied. We developed the Global Decision Tree (GDT) system, which learns a tree structure and tests in one run of the EA. Specialized genetic operators are used,...


Oblique Decision Tree Learning Approaches - A Critical Review

Decision tree classification techniques are gaining increasing impact, especially in light of the ongoing growth of data mining services. A central challenge for decision tree classification is the identification of the split rule and the correct attributes. In this context, the article aims at presenting the current state of research on different techniques for classification using ob...


Real Boosting a la Carte with an Application to Boosting Oblique Decision Tree

In the past ten years, boosting has become a major field of machine learning and classification. This paper brings contributions to its theory and algorithms. We first unify a well-known top-down decision tree induction algorithm due to [Kearns and Mansour, 1999], and discrete AdaBoost [Freund and Schapire, 1997], as two versions of the same higher-level boosting algorithm. It may be used as the ...



Journal:
  • Neurocomputing

Volume 135

Publication date: 2014